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Administrivia 

• PS2 is out. But I was late. So we pushed the 
due date to Wed Sept 24th, 11:55pm. 


• There is still *no* grace period. To avoid 
confusion, the submission site will close at the 
time it’s due. 

* If you miss it you can send email to me and the TAs and 
plead your case. Pretty soon we will have no sympathy 
for presuming that everything works. 


Read: FP chapter 7 
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Stereo: A Special case of Multiple views 



Hartley and Zisserman 



Multi-view geometry, 
matching, invariant 
features, stereo vision 
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Why multiple views? 


• Structure and depth are inherently ambiguous 
from single views. 



Images from Lana Lazebnik 
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How do we see depth? 

• What cues help us to perceive 3d shape and 
depth? 

• What about one eye first? 
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Perspective effects 


Stereo: Disparity and Matching 



S. Seitz 
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Shading 




K. Grauman 
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Texture 




[From A.M. Loh. The recovery of 3-D structure using visual texture patterns. PhD thesis] 
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Focus/defocus 



Images from same point of view, different camera parameters 

3d shape / depth estimates 


[figs from H. Jin and P. Favaro, 2002] 
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Motion 



Figures from L. Zhang 


http://www.brainconnection.com/teasers/?main=illusion/motion-shape 
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Estimating scene shape from one eye 

• “Shape from X”: Shading, Texture, Focus, Motion... 

• Very popular circa 1980 
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But we (and lots of creatures) have two 
eyes! 

• Stereo: 

• shape from “motion” between two views 

• infer 3d shape of scene from two (multiple) images 
from different viewpoints 

Main idea: [figure; surviving labels: "scene point", "optical"] 
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Stereo photography and stereo viewers 


Take two pictures of the same subject from two slightly 
different viewpoints and display so that each eye sees only one 
of the images. 



Invented by Sir Charles Wheatstone 
1838 
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People fascinated by 3D 
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© Copyright 2001 Johnson-Shaw Stereoscopic Museum 
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Teesta suspension bridge-Darjeeling, India 





"Mark Twain at Pool Table", no date, UCR Museum of Photography 
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Stereo photography and stereo viewers 
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The Basic Idea: Two slightly different 
images 



http://www.well.com/~jimg/stereo/stereo_list.html 
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So how do humans do it? 
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Random dot stereograms 

• Julesz 1960: Do we identify local brightness 
patterns before fusion (monocular process) or 
after (binocular)? 

• To test: pair of synthetic images obtained by 
randomly spraying black dots on white objects 
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Random dot stereograms 




Forsyth & Ponce 
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Random dot stereograms 

• When viewed monocularly, they appear random; 
when viewed stereoscopically, we see 3D structure. 

• Conclusion: human binocular fusion is not based 
on matching large-scale structures or on any 
processing of the individual images. 

• Imaginary “cyclopean retina” that combines the 
left and right image stimuli as a single unit. The 
cells in the brain’s visual cortex that create this 
“percept” were discovered later. 
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Estimating depth with stereo 


• Stereo: shape from “motion” between two views 

• We’ll need to consider: 


• Info on camera pose (“calibration”) 
• Image point correspondences 


scene point 
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Geometry for a simple stereo system 


First, assuming parallel 
optical axes, known 
camera parameters 
(i.e., calibrated 
cameras) 

Figure is looking down 
on the cameras and 
image planes 

Baseline B, 
focal length f 

Point P is distance Z in 
camera coordinate 
systems 



[Figure labels: COP_L, COP_R] 
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Geometry for a simple stereo system 



Disparity ... is inversely proportional to depth 
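
The inverse relation follows from similar triangles in the top-down figure. A sketch, assuming x_l and x_r denote the horizontal image coordinates of P in the left and right images, each measured from its own optical axis (these two symbols are an assumption; the text above only names B, f and Z):

```latex
\frac{B - x_l + x_r}{Z - f} = \frac{B}{Z}
\quad\Longrightarrow\quad
Z = \frac{f\,B}{x_l - x_r}
```

Here x_l - x_r is the disparity, so doubling the disparity halves the estimated depth.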
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Depth from disparity 


image I(x,y)        Disparity map D(x,y)        image I'(x',y') 

(x',y') = (x + D(x,y), y) 


So if we could find the corresponding points in two images, we 
could estimate relative depth. . . 
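
Once a disparity is known, depth follows from the simple parallel-camera geometry. A minimal sketch, assuming disparity is measured as d = x_left - x_right >= 0 in pixels (the negative of D under the x' = x + D convention above), the focal length is given in pixels, and the function name and sample values are illustrative only:

```python
import numpy as np

def depth_from_disparity(disparity, focal_px, baseline):
    """Return Z = f * B / d per pixel; pixels with d <= 0 get infinite depth."""
    d = np.asarray(disparity, dtype=float)
    depth = np.full(d.shape, np.inf)
    valid = d > 0
    depth[valid] = focal_px * baseline / d[valid]
    return depth

# Hypothetical camera: f = 800 px, B = 0.1 (depth comes out in the units of B).
disp = np.array([[8.0, 16.0], [32.0, 0.0]])
depth = depth_from_disparity(disp, focal_px=800.0, baseline=0.1)
```

Without calibration (f and B unknown) the same formula still gives depth up to a scale factor, which is why disparity alone yields relative depth.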
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General case, with calibrated cameras 

• The two cameras need not have parallel optical axes and 
image planes. 




Vs. 
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Stereo correspondence constraints 



• Given p in left image, where can corresponding point 
p’ be? 
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Stereo correspondence constraints 



The line containing the center of projection and the 
point p in the left image must project to a line in the 
right image. 
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Epipolar constraint 


p 



Geometry of two views constrains where the corresponding pixel 
for some image point in the first view must occur in the second 
view. 

• It must be on the line carved out by a plane - the 
epipolar plane - connecting the world point and optical 
centers. 
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Epipolar geometry: terms 

• Baseline: line joining the 
camera centers 

• Epipole: point of intersection 
of baseline with image plane 

• Epipolar plane: plane 
containing baseline and world 
point 

• Epipolar line: intersection of 
epipolar plane with the image 
plane 

• All epipolar lines intersect at the epipole 

• An epipolar plane intersects the left and right image 
planes in epipolar lines 

Why is the epipolar constraint useful? 
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Epipolar constraint 



This is useful because it reduces the correspondence problem to 
a 1D search along an epipolar line. 


Image from Andrew Zisserman 
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What do the epipolar lines look like? 










CS 4495 Computer Vision -A. Bobick 


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Example: converging cameras 




Figure from Hartley & Zisserman 
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Example: parallel cameras 



Figure from Hartley & Zisserman 
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For now assume parallel image planes... 

• Assume parallel image planes... 

• Assume same focal lengths... 

• Assume epipolar lines are horizontal... 

• Assume epipolar lines are at the same y location 
in the image... 

• That’s a lot of assuming, but it allows us to move 
to the correspondence problem - which you will 
be solving! 
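
These assumptions can be checked numerically. A small sketch, assuming two ideal pinhole cameras with the same focal length, the left one at the origin and the right one shifted by B along x (the function name and all numbers are illustrative, not from the course code):

```python
import numpy as np

def project(point, f, camera_x=0.0):
    """Pinhole projection into a camera at (camera_x, 0, 0) looking down +Z."""
    X, Y, Z = point
    return np.array([f * (X - camera_x) / Z, f * Y / Z])

f, B = 800.0, 0.1                     # hypothetical focal length (px) and baseline
P = np.array([0.3, -0.2, 4.0])        # a scene point in left-camera coordinates

left = project(P, f)                  # left camera at the origin
right = project(P, f, camera_x=B)     # right camera shifted by B along x
disparity = left[0] - right[0]
```

Both projections share the same y coordinate, so the match lies on the same scanline, and the horizontal offset equals f * B / Z.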





Correspondence problem 



• Hypothesis 1 
o Hypothesis 2 
o Hypothesis 3 


Left image 


Right image 


Multiple match 
hypotheses 
satisfy epipolar 
constraint, but 
which is correct? 


Figure from Gee & Cipolla 1999 
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Correspondence problem 

• Beyond the hard constraint of epipolar geometry, 
there are “soft” constraints to help identify 
corresponding points 

• Similarity 

• Uniqueness 

• Ordering 

• Disparity gradient 

• To find matches in the image pair, we will assume 

• Most scene points visible from both views 

• Image regions for the matches are similar in 
appearance 
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Dense correspondence search 



For each epipolar line 

For each pixel / window in the left image 

• compare with every pixel / window on same epipolar line 
in right image 

• pick position with minimum match cost (e.g., SSD, 
normalized correlation) 


Adapted from Li Zhang 
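
The loop above can be sketched directly. This is a brute-force illustration, assuming rectified grayscale images (epipolar lines are image rows), SSD as the match cost, and a non-negative search range d = x_left - x_right; the function and parameter names are hypothetical:

```python
import numpy as np

def ssd_block_match(left, right, max_disp, half_win):
    """For each left pixel, pick d in [0, max_disp] with minimum window SSD.

    The candidate window in the right image is centered at column x - d.
    Border pixels that cannot hold a full window keep disparity 0.
    """
    left = left.astype(float)
    right = right.astype(float)
    h, w = left.shape
    disp = np.zeros((h, w), dtype=int)
    for y in range(half_win, h - half_win):
        for x in range(half_win, w - half_win):
            ref = left[y - half_win:y + half_win + 1,
                       x - half_win:x + half_win + 1]
            best_cost, best_d = np.inf, 0
            for d in range(max_disp + 1):
                xr = x - d
                if xr - half_win < 0:      # window would leave the image
                    break
                cand = right[y - half_win:y + half_win + 1,
                             xr - half_win:xr + half_win + 1]
                cost = np.sum((ref - cand) ** 2)
                if cost < best_cost:
                    best_cost, best_d = cost, d
            disp[y, x] = best_d
    return disp

# Synthetic check: the left image is the right image shifted by 4 pixels.
rng = np.random.default_rng(0)
right_img = rng.random((20, 40))
left_img = np.roll(right_img, 4, axis=1)
disp = ssd_block_match(left_img, right_img, max_disp=8, half_win=2)
```

Each pixel is decided independently here, which is what makes the result noisy in textureless or occluded regions.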
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Correspondence search with similarity constraint 


Left Right 



• Slide a window along the right scanline and 
compare contents of that window with the 
reference window in the left image 

• Matching cost: SSD or normalized correlation 
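
The two costs behave differently under brightness changes. A short sketch, assuming the zero-mean variant of normalized correlation (the slide does not say which variant is meant) and hypothetical function names:

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences: 0 for identical windows, larger is worse."""
    return float(np.sum((a - b) ** 2))

def normalized_correlation(a, b, eps=1e-12):
    """Zero-mean normalized correlation in [-1, 1]: larger is better."""
    a0 = a - a.mean()
    b0 = b - b.mean()
    denom = np.linalg.norm(a0) * np.linalg.norm(b0) + eps  # eps avoids 0/0
    return float(np.sum(a0 * b0) / denom)

rng = np.random.default_rng(1)
window = rng.random((5, 5))
brighter = 1.5 * window + 0.2   # same pattern under a gain and offset change

ssd_score = ssd(window, brighter)
ncc_score = normalized_correlation(window, brighter)
```

SSD penalizes the brightness change even though the pattern is identical, while the normalized score stays near 1. To use it as a cost to minimize, one option is 1 - normalized_correlation.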
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Correspondence search with similarity constraint 



SSD 
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Correspondence search with similarity constraint 



Norm. corr 
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Correspondence problem 



Intensity 

profiles 




• Clear correspondence between intensities, but also noise and ambiguity 


Source: Andrew Zisserman 
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Correspondence problem 



Neighborhoods of corresponding points are 
similar in intensity patterns. 


Source: Andrew Zisserman 
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Correlation-based window matching 



[Figure: left image band (x), right image band (x'), and their cross-correlation] 

disparity = x' - x 
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Correlation-based window matching 




[Figure: target region in the left image band (x), the right image band (x'), and their cross-correlation] 

Textureless regions are 
non-distinct; high 
ambiguity for matches. 
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Effect of window size 




Source: Andrew Zisserman 
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Effect of window size 



W = 3 W = 20 


Want window large enough to have sufficient intensity 
variation, yet small enough to contain only pixels with 
about the same disparity. 


Figures from Li Zhang 
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Correspondence problem 

• Beyond the hard constraint of epipolar geometry, 
there are “soft” constraints to help identify 
corresponding points 

• Similarity 

• Disparity gradient - depth doesn’t change too quickly. 

• Uniqueness 

• Ordering 
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Uniqueness constraint 

• At most one match in the right image for every point in the 
left image 



Figure from Gee & 
Cipolla 1999 
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Problem: Occlusion 

• Uniqueness says “up to one match” per pixel 

• When is there no match? 



Occluded pixels 
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Ordering constraint 

• Points on same surface (opaque object) will be in same 
order in both views 



Figure from Gee & 
Cipolla 1999 
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Ordering constraint 

• Won’t always hold, e.g., consider a transparent object or 
an occluding surface 



Figures from Forsyth & Ponce 
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Stereo results 

• Data from University of Tsukuba 

• Similar results on other images without ground truth 



Scene 


Ground truth 
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Results with window search 



Window-based matching 
(best window size) 


Ground truth 
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Better solutions 

• Beyond individual correspondences to estimate 
disparities: 

• Optimize correspondence assignments jointly 

• Scanline at a time (DP) 

• Full 2D grid (graph cuts) 
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Scanline stereo 

• Try to coherently match pixels on the entire 
scanline 



[Figure: intensity profiles along corresponding scanlines of the left and right images] 
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“Shortest paths” for scan-line stereo 


[Figure: matching path between the left and right scanlines; path segments are labeled "one-to-one", "left occlusion", and "right occlusion"] 

Can be implemented with dynamic programming 

Ohta & Kanade ’85, Cox et al. ’96, Intille & Bobick ’01 


Slide credit: Y. Boykov 
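
A minimal dynamic-programming sketch of this idea for a single scanline. It is a simplified illustration, not the exact formulation of any of the cited papers: it assumes a squared-difference cost for one-to-one matches, a constant per-pixel occlusion cost, and hypothetical function and variable names.

```python
import numpy as np

def dp_scanline(left_row, right_row, occlusion_cost):
    """Align two scanlines; return left-minus-right disparity per left pixel.

    Left pixels with no match in the right scanline get NaN.
    """
    n, m = len(left_row), len(right_row)
    cost = np.zeros((n + 1, m + 1))
    cost[:, 0] = occlusion_cost * np.arange(n + 1)
    cost[0, :] = occlusion_cost * np.arange(m + 1)
    # move: 0 = one-to-one match, 1 = skip a left pixel, 2 = skip a right pixel
    move = np.zeros((n + 1, m + 1), dtype=int)
    move[1:, 0] = 1
    move[0, 1:] = 2
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            match = cost[i - 1, j - 1] + (left_row[i - 1] - right_row[j - 1]) ** 2
            skip_left = cost[i - 1, j] + occlusion_cost
            skip_right = cost[i, j - 1] + occlusion_cost
            options = (match, skip_left, skip_right)
            move[i, j] = int(np.argmin(options))
            cost[i, j] = options[move[i, j]]
    # Backtrack the cheapest path from the end of both scanlines.
    disparity = np.full(n, np.nan)
    i, j = n, m
    while i > 0 or j > 0:
        if move[i, j] == 0:
            disparity[i - 1] = (i - 1) - (j - 1)
            i, j = i - 1, j - 1
        elif move[i, j] == 1:
            i -= 1
        else:
            j -= 1
    return disparity

# Toy scanlines: the right pattern appears 2 pixels further right in the left row.
right_row = np.array([10.0, 20.0, 30.0, 40.0, 50.0, 60.0])
left_row = np.array([3.0, 7.0, 10.0, 20.0, 30.0, 40.0])
d = dp_scanline(left_row, right_row, occlusion_cost=1.0)
```

Because the path can only move forward in both scanlines, the ordering and uniqueness constraints are enforced by construction.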
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Coherent stereo on 2D grid 

• Scanline stereo generates streaking artifacts 



• Can’t use dynamic programming to find 
spatially coherent disparities/correspondences 
on a 2D grid 
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Stereo as energy minimization 



• What defines a good stereo correspondence? 

1. Match quality 

• Want each pixel to find a good match in the other image 

2. Smoothness 

• If two pixels are adjacent, they should (usually) move 
about the same amount 
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Stereo matching as energy minimization 



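$$E = \alpha\, E_{\text{data}}(I_1, I_2, D) + \beta\, E_{\text{smooth}}(D)$$

$$E_{\text{data}} = \sum_i \bigl( W_1(i) - W_2(i + D(i)) \bigr)^2$$

$$E_{\text{smooth}} = \sum_{\text{neighbors } i,j} \rho\bigl( D(i) - D(j) \bigr)$$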


• Energy functions of this form can be minimized 
using graph cuts 


Y. Boykov, O. Veksler, and R. Zabih, Fast Approximate 
Energy Minimization via Graph Cuts, PAMI 2001 

Source: Steve Seitz 
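
To make the energy concrete, the sketch below only evaluates it for a given disparity map; it does not minimize it with graph cuts. It assumes single-pixel windows for the data term, rho(.) = absolute difference on 4-connected neighbors, the x' = x + D(x, y) convention, and hypothetical names throughout:

```python
import numpy as np

def stereo_energy(left, right, disp, alpha=1.0, beta=1.0):
    """Evaluate E = alpha * E_data + beta * E_smooth for an integer disparity map."""
    h, w = left.shape
    ys, xs = np.mgrid[0:h, 0:w]
    xr = np.clip(xs + disp, 0, w - 1)            # x' = x + D, clamped at the border
    e_data = np.sum((left - right[ys, xr]) ** 2)
    e_smooth = (np.sum(np.abs(np.diff(disp, axis=0))) +
                np.sum(np.abs(np.diff(disp, axis=1))))
    return float(alpha * e_data + beta * e_smooth)

# Synthetic pair: every left pixel matches the right pixel 2 columns to its left.
rng = np.random.default_rng(2)
right_img = rng.random((6, 10))
left_img = np.empty_like(right_img)
left_img[:, 2:] = right_img[:, :-2]
left_img[:, :2] = right_img[:, :1]               # agrees with the border clamping

true_disp = np.full((6, 10), -2)
noisy_disp = true_disp.copy()
noisy_disp[3, 5] = 0                             # one wrong pixel

e_true = stereo_energy(left_img, right_img, true_disp)
e_noisy = stereo_energy(left_img, right_img, noisy_disp)
```

The single wrong pixel raises both terms: it no longer matches in appearance, and it disagrees with all four of its neighbors.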
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Better results... 



State of the art method Ground truth 

Boykov et al., Fast Approximate Energy Minimization via Graph Cuts. 

International Conference on Computer Vision, September 1999. 


For the latest and greatest: http://www.middlebury.edu/stereo/ 
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Challenges 

• Low-contrast; textureless image regions 

• Occlusions 

• Violations of brightness constancy (e.g., specular 
reflections) 

• Really large baselines (foreshortening and 
appearance change) 

• Camera calibration errors 



